The CAC 40, the main stock market index of the Paris stock exchange, is a key indicator of the state of the French and European economy. Bringing together the 40 largest French market capitalizations listed on Euronext Paris, this index provides a representative synthesis of the performance of leading companies in strategic sectors such as industry, finance, and technology. It plays a central role in investor decision-making, economic cycle analysis, and assessing the competitiveness of French companies on a global scale.
In an economic context marked by the COVID-19 pandemic, rising interest rates, and geopolitical tensions, it becomes crucial to examine the key factors driving fluctuations in the CAC 40. This project aims to achieve two main objectives: first, to analyze the sectoral structure of the index to identify internal dynamics affecting its performance; and second, to study the correlations between the CAC 40 and macroeconomic variables such as interest rates and currency fluctuations.
To this end, we will combine traditional statistical approaches and machine learning tools, enriched with graphical visualizations and predictive models. This methodology will provide a comprehensive and rigorous analysis of the phenomena under study. Finally, a reflection on the practical implications of these results will be conducted, offering insights for both investors and financial analysts to inform their strategic decisions.
The files used contain the economic, financial, and stock market information necessary to analyze variations in the CAC 40. The Euribor rates and unemployment rate were not integrated into the daily_data table, as they are reported at monthly and annual frequencies, respectively, unlike the variables in daily_data.csv, which are recorded on a daily basis. Below is an overview of the files and their dimensions:
| File Name | Number of Rows | Number of Columns |
|---|---|---|
| cac40_composition | 40 | 11 |
| euribor_3m_rate | 371 | 2 |
| france_unemployment_rate | 49 | 2 |
| daily_data | 8767 | 5 |
The next step is to convert each variable to a type that matches its meaning. Most of them are imported as character strings (chr), so dates must be converted to the Date format, numbers to numeric, and so on.
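As an illustration of this conversion step, here is a minimal Python sketch that turns one row of string fields into typed values. The ISO date format, the comma thousands separator, and the sample values are assumptions made for the example, not properties of the actual files.

```python
from datetime import date, datetime
from typing import Dict

def parse_row(raw: Dict[str, str]) -> Dict[str, object]:
    """Convert one row of string fields into typed values."""
    return {
        # Assumes ISO dates (YYYY-MM-DD); adjust the format string otherwise.
        "Date": datetime.strptime(raw["Date"], "%Y-%m-%d").date(),
        "euro_usd": float(raw["euro_usd"]),
        # Assumes a comma is used as thousands separator in price strings.
        "cac40_price": float(raw["cac40_price"].replace(",", "")),
    }

# Hypothetical sample row, for illustration only.
row = parse_row({"Date": "2023-01-05", "euro_usd": "1.0522", "cac40_price": "6,761.50"})
print(row)
```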
After adjusting the data types to ensure consistency with the intended analyses, we provide below a detailed description of the variables present in each dataset, specifying their role and meaning within the scope of this study.
| Dataset | Variable Name | Type | Description |
|---|---|---|---|
| cac40_composition | company_name | character | Name of the listed company |
| cac40_composition | sector | character | Company’s business sector |
| cac40_composition | stock_price | numeric | Company’s stock price |
| cac40_composition | Ticker | character | Company’s ticker symbol (stock symbol) |
| cac40_composition | shares_outstanding_millions | numeric | Number of shares in millions |
| cac40_composition | capitalization_in_billion | numeric | Market capitalization in billions |
| cac40_composition | proportion_ds_indice | numeric | Weight in the CAC 40 index |
| cac40_composition | dividend_2023_in_millions | integer | Dividends paid in 2023 (in millions) |
| cac40_composition | part_dividende_2023 | numeric | Proportion of total dividends in 2023 |
| cac40_composition | stock_price_5jan | numeric | Stock price on January 5th |
| cac40_composition | stock_price_30dec | numeric | Stock price on December 30th |
| euribor_3m_rate | EURIBOR_3M_Rate | numeric | 3-month Euribor rate |
| euribor_3m_rate | Date | Date | Recording date |
| france_unemployment_rate | Date | Date | Recording year |
| france_unemployment_rate | unemployment_rate | numeric | Annual unemployment rate |
| daily_data | Date | Date | Recording date |
| daily_data | euro_usd | numeric | EUR/USD exchange rate |
| daily_data | sp500_price | numeric | S&P 500 index price |
| daily_data | cac40_price | numeric | CAC 40 index price |
| daily_data | urw_price | numeric | Unibail-Rodamco-Westfield stock price |
The data used in this study was collected over a 23-year period with daily granularity, limited to business days. While necessary to reflect financial market dynamics, this characteristic inevitably results in missing values for certain days, particularly due to public holidays or periods without trading.
To maintain the integrity of the time series and avoid significant data loss, we opted for linear interpolation of missing values. This method estimates missing values based on adjacent data, ensuring continuity in the analysis while minimizing potential distortions.
This methodological choice is justified by the length of the study period (23 years), which helps mitigate the impact of estimates on overall conclusions. Moreover, systematically removing rows with missing values would have significantly reduced the volume of usable data, risking biased results or limiting the scope of the analysis.
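The interpolation described above can be sketched as follows. This is a minimal Python illustration, assuming equally spaced observations and missing values encoded as `None`; the function name and sample prices are hypothetical.

```python
from typing import List, Optional

def interpolate_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior runs of None by linear interpolation between known neighbours.

    Leading or trailing None values are left untouched, since they have
    only one known neighbour.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled

# Hypothetical prices with a two-day gap (for example a long weekend).
prices = [100.0, None, None, 106.0, 107.0]
filled = interpolate_linear(prices)
print(filled)
```

Each filled value lies on the straight line joining the two nearest observed prices, so the method cannot create a level outside the range of its neighbours, but it does smooth over any jump that occurred during the gap.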
| Variable | Number of Missing Values |
|---|---|
| Date | 0 |
| euro_usd | 0 |
| sp500_price | 0 |
| cac40_price | 0 |
| urw_price | 0 |
As the benchmark index of the Paris market, the CAC 40 brings together the forty leading companies listed on Euronext Paris. The sectoral composition of this index reveals a strong concentration around a few specific sectors, making it a relevant reflection of French economic dynamics.
The chart above illustrates the sectoral distribution of the CAC 40, showing a significant predominance of the consumer sector (39.6%) and financial services (22.3%). This structure reflects the central role of these sectors in the French economy, both nationally and internationally. In contrast, the technology sector, while strategic, represents a modest 4.8%, highlighting its still limited weight compared to other international indices, such as the Nasdaq, which includes major companies like Google, Apple, Microsoft, and Meta.
Regarding profit redistribution, the donut chart highlights the contribution of the financial sector (29.8%) and the consumer sector (18%) to dividends paid to shareholders. These shares contrast with the sectors’ weights in the index: the financial sector accounts for a larger share of dividends than of index weight (29.8% versus 22.3%), whereas the consumer sector accounts for a much smaller one (18% versus 39.6%). The technology and utilities sectors make more modest contributions, which may reflect a strategic focus on reinvestment.
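The contrast between dividend share and index weight can be made explicit with a simple ratio. The sketch below uses only the four percentages quoted in the text; the ratio itself is an illustrative device, not a measure used in the report.

```python
# Sector shares quoted in the text, in percent.
index_weight = {"financial": 22.3, "consumer": 39.6}
dividend_share = {"financial": 29.8, "consumer": 18.0}

# Ratio above 1: the sector pays out more than its index weight would suggest.
dividend_to_weight = {
    sector: dividend_share[sector] / index_weight[sector]
    for sector in index_weight
}

for sector, ratio in dividend_to_weight.items():
    print(f"{sector}: {ratio:.2f}")
```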
The joint analysis of sectoral composition and dividend distribution sheds light on the performance drivers of the CAC 40. It also reveals the differentiated strategic choices of companies by sector—between maximizing shareholder value and prioritizing long-term development.
The companies that make up the CAC 40 show highly heterogeneous weights, resulting in varied contributions to the weekly fluctuations of the index. A detailed analysis of relative contributions shows that certain companies, particularly large-cap firms such as LVMH or TotalEnergies, exert a disproportionately strong influence on the overall performance of the index.
A mosaic chart representing the contribution of companies highlights this dynamic. This visualization clearly shows that the 5 to 10 most influential companies are often responsible for a significant portion of the variations observed in the CAC 40.
To complement this analysis, a bar chart allows for a more precise quantification of individual contributions. This representation provides a useful tool for identifying companies whose impact, while less visible at first glance, remains significant in weekly fluctuations.
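One common way to quantify such contributions is to multiply each company's index weight by its return over the period, so that the contributions sum approximately to the index return. The sketch below assumes this weighted-sum approximation; the company labels, weights, and returns are hypothetical.

```python
# Hypothetical index weights (fractions summing to 1) and weekly returns.
weights = {"A": 0.50, "B": 0.30, "C": 0.20}
weekly_returns = {"A": 0.02, "B": -0.01, "C": 0.005}

# Contribution of each company to the index move: weight times return.
contributions = {name: weights[name] * weekly_returns[name] for name in weights}

# Under this approximation the index return is the sum of the contributions.
index_return = sum(contributions.values())

# Rank companies by the absolute size of their contribution.
ranked = sorted(contributions, key=lambda name: abs(contributions[name]), reverse=True)
print(ranked, round(index_return, 4))
```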
The performance of CAC 40 companies cannot be isolated from the economic dynamics of the regions in which they operate. A geographic analysis of regional contributions in 2023 highlights the dominant influence of European and North American markets. The United States, in particular, benefits from strategic trade agreements that facilitate market access for many groups, strengthening its contribution to the index’s overall revenues. Moreover, Asian regions play a significant role, especially in the luxury sector, where they remain a key driver of growth for groups like LVMH and Hermès. This importance is also reflected in the international activities of large companies such as L’Oréal and Airbus. These economic zones capture a substantial share of the revenues generated by the index’s companies, illustrating the fundamental role of global dynamics in their results.
To further refine the analysis, a Ridgeline plot of weekly variations by sector offers an overview of the specific distributions for each industry. For instance, the technology sector stands out for its increased volatility, while the consumer and financial services sectors show distributions more concentrated around their mean. These characteristics may reflect differentiated investment behaviors or varying sector sensitivities to economic shocks.
These combined visual representations reveal the complexity of interactions between companies, sectors, and regions in the overall performance of the CAC 40. They provide key insights for identifying performance drivers and strategic opportunities for investors.
Stock market indices, especially the CAC 40, react strongly to major economic events that significantly influence their returns. A comparative analysis of returns before and after these events highlights characteristic variations depending on the type of crisis. For example, the global financial crisis caused significant losses followed by a gradual recovery.
The study of daily return distributions provides valuable insights into the underlying dynamics of these variations. The histogram of daily returns for 2023 illustrates a symmetric distribution centered around zero, indicating overall stability. However, the presence of fat tails in this distribution highlights an increased frequency of extreme returns, often caused by exogenous shocks.
A QQ-plot used to check the normality assumption shows that, although CAC 40 returns broadly follow a normal distribution, significant deviations appear in the tails. These deviations underscore the index’s sensitivity to extreme events and are characteristic of a slightly leptokurtic distribution. This observation is crucial for financial risk management and return modeling, where accounting for rare but impactful events is essential.
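Leptokurtosis can also be checked numerically through excess kurtosis, which is zero for a normal distribution and positive when tails are fatter. The following is a minimal sketch using population moments; the two return series are made up to show the effect of a pair of extreme days.

```python
from typing import Sequence

def excess_kurtosis(returns: Sequence[float]) -> float:
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # population variance
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Hypothetical series: steady +/-1% days, then the same with two extreme days.
calm = [-0.01, 0.01] * 10
shocked = calm + [-0.08, 0.08]

print(excess_kurtosis(calm), excess_kurtosis(shocked))
```

Adding only two extreme observations moves the statistic from negative to clearly positive, which is the numerical counterpart of the tail deviations seen on a QQ-plot.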
These analyses demonstrate the importance of integrating robust and tailored methodologies to understand the specific characteristics of volatility and observed distributions in financial markets.
The analysis of daily returns in 2023 highlights a complex dynamic, marked by significant fluctuations at times. A graphical representation in the form of a heatmap of daily variations clearly illustrates this volatility: periods of sharp decline (marked in red) are mainly concentrated in the first quarter, often linked to global macroeconomic uncertainties. In contrast, positive peaks (in green) indicate rebound periods, generally associated with favorable announcements or market adjustments.
To deepen the analysis, a monthly boxplot of daily returns reveals significant dispersion in the first half of the year, with fat tails in the histogram indicating high volatility. In contrast, the second half is characterized by a gradual stabilization, with returns whose medians converge toward zero, reflecting a certain calm.
These observations highlight a generally symmetric behavior of returns, but punctuated by extreme values, requiring increased vigilance from investors. The distribution of returns, while approaching a normal curve, remains influenced by exogenous shocks, emphasizing the importance of integrating these rare events into financial models.
Major exogenous events, such as Brexit or the 2008 financial crisis, induce significant disruptions in the returns and volatility of the CAC 40 index. Analyzing these periods helps decipher how financial markets respond to uncertainty and reveals the underlying dynamics.
The evolution of average daily returns around Brexit clearly illustrates the impact of this event on the markets. Before the official announcement (blue dashed line), returns were characterized by relative stability, with moderate variations. However, the Brexit announcement triggered a sharp drop followed by a rapid rebound. This behavior reflects heightened short-term volatility, typical of periods of major political uncertainty. This phenomenon, amplified by portfolio adjustments in response to unexpected scenarios, underscores the importance of economic agents’ expectations.
Regarding the long-term evolution of volatility, the boxplot of monthly volatilities of CAC 40 returns highlights major peaks during global crises, particularly in 2008 (global financial crisis) and 2020 (COVID-19 pandemic). These periods of extreme turbulence resulted in amplified return variations, direct reflections of economic uncertainty and massive portfolio adjustments by investors. Conversely, the years following these crises show a gradual stabilization trend in volatility levels, indicating a progressive return to market normality.
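Monthly volatility of this kind is typically obtained by grouping daily returns by calendar month and taking their standard deviation. The sketch below assumes that definition; the dates and returns are hypothetical and only meant to contrast a calm month with a turbulent one.

```python
from collections import defaultdict
from datetime import date
from statistics import stdev
from typing import Dict, List, Sequence, Tuple

def monthly_volatility(
    dates: Sequence[date], returns: Sequence[float]
) -> Dict[Tuple[int, int], float]:
    """Sample standard deviation of daily returns for each (year, month)."""
    buckets: Dict[Tuple[int, int], List[float]] = defaultdict(list)
    for day, ret in zip(dates, returns):
        buckets[(day.year, day.month)].append(ret)
    # A standard deviation needs at least two observations.
    return {month: stdev(vals) for month, vals in buckets.items() if len(vals) >= 2}

# Hypothetical data: a quiet January followed by a turbulent February.
dates = [date(2020, 1, d) for d in (2, 3, 6, 7)] + [date(2020, 2, d) for d in (3, 4, 5, 6)]
returns = [0.001, -0.001, 0.002, -0.002, 0.03, -0.04, 0.05, -0.03]

vol = monthly_volatility(dates, returns)
print(vol)
```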
These analyses reveal that, although generally stable, financial markets remain vulnerable to unforeseen exogenous shocks. Integrating these risks into investment strategies is essential, requiring a rigorous assessment of the intensity and duration of each event. These elements are fundamental prerequisites for developing robust models capable of capturing future fluctuations and mitigating the effects of volatility.
The analysis of the relationship between CAC 40 index returns and 3-month Euribor rates reveals a slightly negative correlation (-0.13), highlighted by the downward slope of the regression line. This result contradicts the expected positive correlation during periods of economic recovery, where rising Euribor rates reflect strengthening economic activity. Here, the weak negative correlation suggests that the impact of Euribor rates on CAC 40 returns is marginal in a broader context.
## [1] "Correlation coefficient: -0.13"
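For reference, a coefficient of this kind is a Pearson correlation, which can be computed as in the minimal Python sketch below. The rate and return values are hypothetical and are not the data behind the reported figure.

```python
import math
from typing import Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical Euribor levels (percent) and index returns over the same periods.
rates = [0.5, 1.0, 1.5, 2.0, 2.5]
index_returns = [0.02, 0.01, 0.015, -0.005, 0.0]

r = pearson(rates, index_returns)
print(f"Correlation coefficient: {r:.2f}")
```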
However, segmenting the analysis by economic cycles reveals important nuances. During crises, such as the 2008 recession or the European sovereign debt crisis (2011-2012), the influence of Euribor rates on returns becomes more pronounced. This can be explained by accommodative monetary policies during crises that lower Euribor rates, supporting liquidity and boosting investor confidence. In contrast, during post-crisis phases, the relationship becomes less distinct, as investors focus more on economic fundamentals and corporate performance.
These results highlight the complexity of interactions between monetary policies and financial markets. They also emphasize the importance of considering the economic context and market cycles to understand how interest rates impact stock indices. This approach helps refine investment strategies and anticipate the future evolution of the CAC 40 in various economic environments.
The relationship between the EUR/USD exchange rate and CAC 40 index variations shows a negative correlation of -0.24, indicating that an appreciation of the euro against the U.S. dollar tends to be associated with a decline in the French stock index. This phenomenon is based on fundamental economic mechanisms affecting companies in the CAC 40.
On one hand, a rising euro reduces the competitiveness of French exports, especially in markets where the dollar is the reference currency, such as North America. Large exporting companies like Airbus and LVMH see their products become more expensive for local consumers, potentially reducing their market share. On the other hand, a strong euro also decreases profits earned in dollars when converted to euros, affecting multinationals that generate a significant portion of their revenues internationally.
During the 2008 financial crisis, for example, the euro reached 1.6 USD/EUR, coinciding with a sharp decline in the CAC 40 index, exacerbated by a contraction in global demand. Conversely, during the European sovereign debt crisis (2011-2012), a depreciation of the euro boosted French exports, strengthening the performance of CAC 40 companies.
However, the graphical analysis also reveals significant dispersion in the data points, reflecting the influence of other factors such as macroeconomic conditions, monetary policies, or sector-specific dynamics. Thus, while the exchange rate is a crucial factor, it must be considered within a more complex framework where multiple dynamics affect corporate and stock index performance.
The correlation matrix is an essential analytical tool, providing a
synthetic view of linear relationships between key economic and
financial variables influencing the evolution of the CAC 40 index. The
analysis reveals several significant trends, notably a strong positive
correlation with the S&P 500 index. This connection reflects the
interconnectedness of global stock markets, particularly during periods
of crises or global economic recoveries, such as the 2008 financial
crisis or the 2020 pandemic. These events illustrate how major indices
synchronize in response to macroeconomic shocks.
In contrast, the negative correlation observed between the CAC 40 and the EUR/USD exchange rate reflects an inverse dynamic. An appreciation of the euro, as an unfavorable factor for French exporting companies, tends to weigh on the index’s performance. This phenomenon can be explained by reduced price competitiveness for French companies in international markets, directly affecting their profit margins.
Furthermore, a moderate correlation between 3-month Euribor rates and the CAC 40 highlights the indirect influence of interest rate variations on investor expectations. The European Central Bank’s (ECB) monetary policies play a central role here: for example, the rate cuts in 2012, aimed at supporting the recovery after the sovereign debt crisis, helped boost European stock markets.
Finally, a significant negative correlation is identified between the CAC 40 and the unemployment rate. High unemployment, an indicator of unfavorable economic conditions, is often associated with weaker stock performance, reflecting stagnant domestic demand, declining corporate profits, and eroded investor confidence. This link was particularly evident during the economic tensions in the eurozone.
These correlations illustrate the complexity of interactions between economic and financial variables. They reinforce the importance for investors to take these relationships into account in their analyses to anticipate index movements in various economic contexts.
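This analysis aims to predict CAC 40 index variations using a multiple linear regression model based on key macroeconomic variables, namely:
- the EUR/USD exchange rate
- the S&P 500 index
- the French unemployment rate
- the 3-month Euribor rate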
These explanatory variables were chosen for their economic relevance and proven correlation with CAC 40 fluctuations, as demonstrated by previous financial market research. The main objective is to quantify the influence of these factors on CAC 40 prices and evaluate the feasibility of generating reliable predictions to anticipate future trends.
To ensure the quality of the data used in this analysis, a rigorous preprocessing process was conducted:
1) Handling Missing Values: An initial inspection confirmed the absence of significant missing values, ensuring the robustness of the results.
2) Selection of Numerical Variables: Only relevant quantitative variables were retained for the analysis, as illustrated in Figure 1, which displays the distribution of the variables.
The graphs highlight several characteristics of the explanatory variables and the target variable (CAC40_Price):
- A multi-modal distribution for the CAC40 and S&P 500, reflecting periods marked by crises or rapid economic recoveries.
- A notable concentration of EUR/USD rates around 1.2, indicating relative stability over the studied period.
- Significant heterogeneity in 3-month Euribor rates, illustrating the impact of monetary policies over time.
These preliminary observations justify the integration of these macroeconomic variables as the foundation for the predictive model of the CAC 40.
Before constructing the linear regression model, a correlation matrix was calculated to explore the relationships between the explanatory variables and the CAC 40. The results highlight several significant relationships:
- A strong positive correlation with the S&P 500 index, confirming the interdependence of global financial markets.
- A moderate correlation with the EUR/USD rate, reflecting the impact of currency fluctuations on the companies within the index.
- A negative correlation with the 3-month Euribor rate, indicating an inverse influence of credit conditions on stock performance.
- A weak correlation with the unemployment rate, suggesting that this indicator has a limited short-term impact on index variations.
These results confirm the relevance of the selected explanatory variables for modeling while also highlighting potential limitations related to variables with weak correlations to the CAC 40.
To analyze variations in the CAC 40 index, we specified a linear regression model based on the following formula:
\[ CAC40\_Price \sim EUR\_USD\_Price + SP500\_Price + taux\_chomage + EURIBOR\_3M\_Rate \]
This choice is based on the economic relevance of the selected explanatory variables, each of which was examined in the preceding correlation analysis (in the formula, taux_chomage denotes the unemployment rate). The estimated coefficients reveal several significant relationships. While the model captures major linear relationships, it does not fully account for the complex dynamics characterizing the market.
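To make the specification concrete, the sketch below fits a linear model with an intercept by ordinary least squares, solving the normal equations with Gaussian elimination. It is a plain-Python illustration of the technique, not the code used in the study; the function names and the toy data (two predictors instead of four) are assumptions.

```python
from typing import List

def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(features: List[List[float]], target: List[float]) -> List[float]:
    """Return [intercept, b1, ..., bk] minimising the sum of squared errors."""
    design = [[1.0] + row for row in features]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    return solve(xtx, xty)

# Toy data generated exactly from y = 1 + 2*x1 - 3*x2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 5.0]]
target = [3.0, -2.0, 0.0, 2.0, -8.0]

coefficients = fit_ols(features, target)
print([round(c, 6) for c in coefficients])
```

With real price series the predictors are on very different scales and often strongly correlated, so a numerically more robust solver (for example a QR decomposition from a statistics library) would usually be preferable to the normal equations.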
To assess the robustness of the model, it was validated on held-out data. The data was split into two subsets: a training set used to fit the model and a test set used to evaluate its predictions.
The model’s performance was evaluated using several metrics:
## # A tibble: 3 × 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 rmse standard 488.
## 2 rsq standard 0.772
## 3 mae standard 407.
These results indicate satisfactory performance, although the model still has room for improvement, particularly in reducing errors during periods of high volatility.
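The three metrics in the table can be computed as in the sketch below. The function names mirror the metric labels; R² is computed here as the squared correlation between observed and predicted values, which is one common definition and an assumption about how the reported figure was obtained. The index levels are hypothetical.

```python
import math
from typing import Sequence

def rmse(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean squared error, in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))

def mae(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)

def rsq(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and predicted values."""
    n = len(truth)
    mean_t, mean_p = sum(truth) / n, sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_t * var_p)

# Hypothetical observed and predicted index levels on a test set.
observed = [7000.0, 7100.0, 7200.0, 7300.0]
predicted = [7050.0, 7050.0, 7250.0, 7250.0]

print(rmse(observed, predicted), mae(observed, predicted), rsq(observed, predicted))
```

Because RMSE squares the errors before averaging, it penalises large misses more than MAE does; a gap between the two, as in the table above, suggests that a few large errors account for part of the total.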
The graph comparing actual and predicted values illustrates the performance of the linear regression model for the CAC 40. The main observations are as follows:
- General trends well captured: the model effectively reproduces the major dynamics of the index, particularly during prolonged phases of growth or decline.
- Localized deviations: certain divergences appear during periods of high volatility, such as financial crises or peaks of uncertainty. For instance, abrupt fluctuations during crisis periods are not fully anticipated by the model.
- Moderate accuracy in stable periods: in the absence of marked volatility, the model’s predictions tend to converge more closely with observed values, indicating robustness under normal market conditions.
These results demonstrate the model’s ability to provide reliable medium-term predictions while revealing its limitations when faced with sudden and extreme variations.
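Several inherent limitations of the model have been identified, notably its purely linear form and its difficulty anticipating sudden, extreme variations. The following directions could address them.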
Introduction of New Variables
To refine predictions, new explanatory variables could be added:
External Variables:
- Commodity prices (oil, gold)
- Volatility indicators, such as the VIX index
- Detailed sectoral data, including the performance of key sectors in
the CAC 40
Internal Variables:
- Foreign capital flows
- Earnings announcements of companies within the index
Exploration of Advanced Models
Testing non-linear approaches to better capture the complexity of
relationships between variables:
These techniques may better capture complex and non-linear relationships between explanatory variables and the CAC 40.
Time Series-Specific Modeling
Adopting models tailored to temporal data:
This project highlighted the factors influencing CAC 40 fluctuations through an approach combining statistical tools, graphical analyses, and machine learning models. The study of correlations between the CAC 40 and key macroeconomic variables, such as the EUR/USD exchange rate, the S&P 500, 3-month Euribor rates, and the unemployment rate, revealed complex interactions reflecting global economic and financial dynamics.
The linear regression model demonstrated a notable ability to capture the index’s general trends under stable market conditions. However, it remains limited when facing abrupt variations or high volatility contexts, offering opportunities for future improvements.
The results of this project provide interesting perspectives for practical applications. For example, developing an interactive model could allow for simulating hypothetical scenarios, such as a sudden rise in Euribor rates or significant variations in the S&P 500. Integrating these features into an R Shiny interface would enable real-time prediction exploration, enhancing their usefulness for investors.
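A scenario simulation of this kind amounts to re-evaluating the fitted equation with one input shifted. The sketch below illustrates the idea; the coefficients and baseline inputs are hypothetical placeholders, not the values estimated in this study.

```python
from typing import Dict

# Hypothetical coefficients for illustration only; not the fitted values.
COEFFICIENTS: Dict[str, float] = {
    "intercept": 1000.0,
    "sp500_price": 1.2,
    "euro_usd": -500.0,
    "euribor_3m": -80.0,
    "unemployment_rate": -20.0,
}

def predict(inputs: Dict[str, float]) -> float:
    """Evaluate the linear model for one set of macroeconomic inputs."""
    return COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in inputs.items()
    )

# Hypothetical baseline and a scenario with Euribor one point higher.
baseline = {"sp500_price": 4500.0, "euro_usd": 1.10, "euribor_3m": 3.0, "unemployment_rate": 7.3}
scenario = {**baseline, "euribor_3m": baseline["euribor_3m"] + 1.0}

impact = predict(scenario) - predict(baseline)
print(f"Predicted change in the index: {impact:.1f} points")
```

In a linear model the impact of a shock equals the coefficient times the size of the shock, whatever the baseline, which is exactly the kind of simplification that non-linear models would relax.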
Additionally, predictions from this model could be used to optimize investment strategies. By combining these predictions with fundamental analyses, it would be possible to design active portfolio management approaches based on anticipated signals and robust models.
For future research, the project offers pathways to deepen the analysis. Extending the methodology to other international indices, such as the DAX or the FTSE, could lead to enriching comparisons and a better understanding of the specificities of different markets. Exploring multivariate models that consider interactions between multiple indices and macroeconomic variables could also improve predictions by capturing more complex global dynamics.
In conclusion, this project provides a solid foundation for understanding CAC 40 variations and anticipating its future trends. The identified improvement perspectives—especially through the introduction of new explanatory variables and the use of advanced models—pave the way for even more precise and useful analyses. These efforts will help develop powerful predictive tools that assist investors in navigating an ever-evolving economic and financial environment.
- Guibert, Quentin. Data Visualization Course. Moodle Université Paris-Dauphine PSL. Accessed on January 6, 2025.
- Data on CAC40, S&P500, and Unibail-Rodamco-Westfield. Available at: https://www.investing.com.
- Data on Euribor Rates and Euro/USD Exchange Rates. Available at: https://fred.stlouisfed.org.
- Data on the Unemployment Rate in France. Available at: https://data.ecb.europa.eu.
- Les Echos. Regional Revenue Proportions of CAC 40. Available at: https://www.lesechos.fr.